Overview and Architecture Evolution Map
We move from AlexNet's foundational success toward extremely deep convolutional neural networks (CNNs). This shift requires handling extreme depth effectively while maintaining training stability, and it spurred profound architectural innovation. We will analyze three landmark architectures: VGG, GoogLeNet (Inception), and ResNet. Understanding how each solves a different facet of the scaling problem lays the groundwork for the rigorous model interpretability work in the second half of this course.
1. Structural Simplicity: VGG
VGG maximized depth through extremely uniform, small convolutional kernels, stacking only 3x3 convolutional filters. Though computationally expensive, its structural consistency demonstrated that pure depth, achieved with minimal architectural variation, is the primary driver of performance gains, further establishing the importance of small receptive fields.
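As a minimal sketch of this idea (PyTorch is assumed here, and the helper name vgg_stage is illustrative rather than anything from the VGG paper), one stage of a VGG-style network is just a stack of 3x3 convolutions followed by pooling:

```python
import torch
import torch.nn as nn

def vgg_stage(in_channels: int, out_channels: int, num_convs: int) -> nn.Sequential:
    """One VGG-style stage: a stack of 3x3 convolutions, then 2x2 max pooling."""
    layers = []
    for _ in range(num_convs):
        # padding=1 keeps the spatial size, so depth comes purely from stacking.
        layers.append(nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1))
        layers.append(nn.ReLU(inplace=True))
        in_channels = out_channels
    layers.append(nn.MaxPool2d(kernel_size=2, stride=2))  # halve the resolution
    return nn.Sequential(*layers)

# Two stacked 3x3 convs cover the same 5x5 receptive field as one 5x5 conv,
# with fewer parameters and an extra non-linearity.
stage = vgg_stage(64, 128, num_convs=2)
print(stage(torch.randn(1, 64, 56, 56)).shape)  # torch.Size([1, 128, 28, 28])
```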
2. Computational Efficiency: GoogLeNet (Inception)
GoogLeNet responded to VGG's high computational cost by prioritizing efficiency and multi-scale feature extraction. Its core innovation is the Inception module, which performs parallel convolutions (1x1, 3x3, 5x5) and a pooling operation side by side. Crucially, it uses 1x1 convolutions as bottlenecks to sharply reduce the parameter count and computational complexity before the expensive operations.
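A minimal sketch of such a module, again assuming PyTorch (the class name InceptionModule and its branch names are illustrative; the channel split loosely follows GoogLeNet's early blocks, and per-branch output activations are omitted for brevity):

```python
import torch
import torch.nn as nn

class InceptionModule(nn.Module):
    """Sketch of an Inception module: parallel 1x1 / 3x3 / 5x5 convs plus pooling,
    with 1x1 bottlenecks shrinking channel depth before the costly 3x3/5x5 paths."""
    def __init__(self, in_ch, c1, c3_reduce, c3, c5_reduce, c5, pool_proj):
        super().__init__()
        self.branch1 = nn.Conv2d(in_ch, c1, kernel_size=1)
        self.branch3 = nn.Sequential(
            nn.Conv2d(in_ch, c3_reduce, kernel_size=1),  # 1x1 bottleneck
            nn.ReLU(inplace=True),
            nn.Conv2d(c3_reduce, c3, kernel_size=3, padding=1),
        )
        self.branch5 = nn.Sequential(
            nn.Conv2d(in_ch, c5_reduce, kernel_size=1),  # 1x1 bottleneck
            nn.ReLU(inplace=True),
            nn.Conv2d(c5_reduce, c5, kernel_size=5, padding=2),
        )
        self.branch_pool = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_ch, pool_proj, kernel_size=1),  # project pooled features
        )

    def forward(self, x):
        # All branches preserve spatial size, so they concatenate along channels.
        return torch.cat(
            [self.branch1(x), self.branch3(x), self.branch5(x), self.branch_pool(x)],
            dim=1,
        )

block = InceptionModule(192, 64, 96, 128, 16, 32, 32)
print(block(torch.randn(1, 192, 28, 28)).shape)  # torch.Size([1, 256, 28, 28])
```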
Key Engineering Challenges
Question 1
Which architecture emphasized structural uniformity using mostly 3x3 filters to maximize depth?
Question 2
The 1x1 convolution is primarily used in the Inception Module for what fundamental purpose?
Critical Challenge: Vanishing Gradients
Engineering Solutions for Optimization
Explain how ResNet’s identity mapping fundamentally addresses the vanishing gradient problem, beyond what techniques such as improved weight initialization or Batch Normalization provide.
Q1
Describe the mechanism by which the skip connection stabilizes gradient flow during backpropagation.
Solution:
The skip connection introduces an identity term ($+x$) into the block's output, $H(x) = F(x) + x$, so the local derivative becomes $\frac{\partial H}{\partial x} = \frac{\partial F}{\partial x} + 1$. By the chain rule, the gradient reaching the block's input is $\frac{\partial \text{Loss}}{\partial x} = \frac{\partial \text{Loss}}{\partial H}\left(\frac{\partial F}{\partial x} + 1\right)$: the $+1$ term guarantees a direct, unattenuated path for the gradient signal to flow backwards, so upstream weights receive a non-zero, usable gradient regardless of how small the gradients through the residual function $F(x)$ become.
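To make the additive path concrete, here is a minimal PyTorch sketch (the class name ResidualBlock is illustrative, and same-shape input and output are assumed so the identity skip needs no projection):

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Minimal sketch of a ResNet basic block: the output is F(x) + x,
    so the identity path carries gradients past the conv stack unattenuated."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        residual = self.bn2(self.conv2(self.relu(self.bn1(self.conv1(x)))))  # F(x)
        return self.relu(residual + x)  # H(x) = F(x) + x: the skip connection

block = ResidualBlock(64)
x = torch.randn(1, 64, 32, 32, requires_grad=True)
block(x).sum().backward()
# Even if the gradients through F were tiny, the identity path alone
# delivers a non-zero gradient back to x.
print(x.grad.abs().sum() > 0)  # tensor(True)
```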